    Language Understanding for Text-based Games Using Deep Reinforcement Learning

    In this paper, we consider the task of learning control policies for text-based games. In these games, all interactions in the virtual world are through text, and the underlying state is not observed. The resulting language barrier makes such environments challenging for automatic game players. We employ a deep reinforcement learning framework to jointly learn state representations and action policies, using game rewards as feedback. This framework enables us to map text descriptions into vector representations that capture the semantics of the game states. We evaluate our approach on two game worlds, comparing against baselines that use bag-of-words and bag-of-bigrams state representations. Our algorithm outperforms the baselines on both worlds, demonstrating the importance of learning expressive representations.
    Comment: 11 pages, appearing at EMNLP 2015
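    As a rough illustration of the setup (not the authors' exact architecture), the sketch below encodes a tokenized game description with an LSTM and reads Q-values off a linear head; the vocabulary size, dimensions, and single action head are illustrative assumptions, and the paper's model additionally scores command objects.

        import torch
        import torch.nn as nn

        class TextDQN(nn.Module):
            # Toy LSTM-DQN: text state -> vector representation -> Q-values.
            def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_actions=8):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim)
                self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
                self.q_head = nn.Linear(hidden_dim, num_actions)  # simplified: one head only

            def forward(self, token_ids):
                # token_ids: (batch, seq_len) integer-encoded game text
                hidden, _ = self.lstm(self.embed(token_ids))
                state_vec = hidden.mean(dim=1)  # mean-pool into a state representation
                return self.q_head(state_vec)   # one Q-value per action

        q_net = TextDQN()
        tokens = torch.randint(0, 1000, (1, 20))  # a fake 20-token room description
        action = q_net(tokens).argmax(dim=1)      # greedy action under the current Q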

    Robust Leader Election in a Fast-Changing World

    We consider the problem of electing a leader among nodes in a highly dynamic network where the adversary has unbounded capacity to insert and remove nodes (including the leader) from the network and to change connectivity at will. We present a randomized Las Vegas algorithm that (re)elects a leader in O(D\log n) rounds with high probability, where D is a bound on the dynamic diameter of the network and n is the maximum number of nodes in the network at any point in time. We assume a model of broadcast-based communication where a node can send only one message of O(\log n) bits per round and is not aware of the receivers in advance. Thus, our results also apply to mobile wireless ad-hoc networks, improving over the optimal (for deterministic algorithms) O(Dn) solution presented at FOMC 2011. We show that our algorithm is optimal by proving that any randomized Las Vegas algorithm takes at least \Omega(D\log n) rounds to elect a leader with high probability, which shows that our algorithm yields the best possible (up to constants) termination time.
    Comment: In Proceedings FOMC 2013, arXiv:1310.459
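    To make the role of the diameter D concrete, here is a toy synchronous simulation of flooding-based rank election on a static graph; it shows why roughly D rounds of broadcast are needed for agreement, but it is only a didactic stand-in, not the paper's dynamic-network algorithm.

        import random

        def elect_leader(adj, rounds):
            # Each node draws a random rank and repeatedly broadcasts the best
            # (rank, id) pair it has seen; after ~D rounds all nodes agree.
            best = {v: (random.random(), v) for v in adj}
            for _ in range(rounds):
                nxt = dict(best)
                for v in adj:
                    for u in adj[v]:  # v broadcasts its current best to its neighbors
                        nxt[u] = max(nxt[u], best[v])
                best = nxt
            return {v: best[v][1] for v in adj}  # each node's view of the leader

        # A 4-node line graph with diameter D = 3, so 3 rounds suffice here.
        adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
        print(elect_leader(adj, rounds=3))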

    Practical Differentially Private Hyperparameter Tuning with Subsampling

    Tuning the hyperparameters of differentially private (DP) machine learning (ML) algorithms often requires the use of sensitive data, and this may leak private information via the hyperparameter values. Recently, Papernot and Steinke (2022) proposed a class of DP hyperparameter tuning algorithms in which the number of random search samples is itself randomized. Commonly, these algorithms still considerably increase the DP privacy parameter ε over non-tuned DP ML model training and can be computationally heavy, as evaluating each hyperparameter candidate requires a new training run. We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data for the hyperparameter tuning and by extrapolating the optimal values to a larger dataset. We provide a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to a better privacy-utility trade-off than the baseline method of Papernot and Steinke.
    Comment: 26 pages, 6 figures
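    A minimal sketch of the idea, with hypothetical helper names: tune on a small random subset of the sensitive data using a randomized number of trials, then reuse the winning hyperparameters for full-data training. The trial-count distribution and scoring function here are placeholders; the real method's privacy analysis depends on both.

        import random

        def dp_tune(data, candidates, train_and_score, subsample_frac=0.1):
            # Tune on a random subset of the sensitive data rather than all of it.
            subset = random.sample(data, int(subsample_frac * len(data)))
            k = 1 + random.getrandbits(4)  # stand-in for the randomized trial count of Papernot and Steinke (2022)
            trials = [random.choice(candidates) for _ in range(k)]
            scored = [(train_and_score(subset, h), h) for h in trials]  # each trial is a fresh DP training run
            return max(scored, key=lambda sh: sh[0])[1]  # winner, to be extrapolated to the full dataset

        candidates = [{"lr": lr, "clip": c} for lr in (0.1, 0.5) for c in (0.5, 1.0)]
        best = dp_tune(list(range(1000)), candidates,
                       train_and_score=lambda subset, h: random.random())  # dummy scorer for the sketch
        print(best)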